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Abstract. We describe a search for the extreme-mass-ratio inspiral sources in 
the Round IB Mock LISA Data Challenge data sets. The search algorithm 
is a Monte-Carlo search based on the Metropolis-Hastings algorithm, but also 
incorporates simulated, thermostated and time annealing, plus a harmonic 
identification stage designed to reduce the chance of the chain locking onto 
secondary maxima. In this paper, we focus on describing the algorithm that 
we have been developing. We give the results of the search of the Round IB 
data, although parameter recovery has improved since that deadline. Finally, 
we describe several modifications to the search pipeline that we are currently 
investigating for incorporation in future searches. 
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1. Introduction 

The inspirals of stellar mass compact objects into supermassive black holes in the 
centres of galaxies — extreme-mass-ratio inspirals (EMRIs) — are one of the most 
exciting potential sources of gravitational waves for the planned Laser Interferometer 
Space Antenna (LISA). The detection of such sources in the LISA data stream and 
parameter estimation for them is a very challenging technical problem, however, as 
the instantaneous amplitude of a typical signal is an order of magnitude below the 
noise fluctuations in the detector. Moreover, the long duration of the signals (LISA 
will detect up to 10 5 waveform cycles in an observation) and the large parameter 
space of possible sources (an EMRI signal depends on fourteen parameters) makes 
fully-coherent matched filtering computationally impossible pQ. 

Several possible algorithms have been considered for EMRI detection. In order 
to compute initial event rate estimates, a semi-coherent algorithm was proposed, in 
which the data stream would be divided up into short (two or three week) segments, 
that would be searched coherently via matched filtering. The signal-to-noise ratio 
(SNR) would then be built up in a second stage via incoherent summation of power 
along trajectories through the coherent segments [JJ. This algorithm was designed 
to make full use of available computing power and its effectiveness has not yet been 
demonstrated practically. Time-frequency techniques have also been explored ([2 |6J) 
and have been shown to be able to both detect and recover parameters [Hj when used 
to search for single, high-SNR EMRIs in instrumental noise. Their effectiveness is 
likely to be significantly reduced when confronted with more realistic situations, in 
which there are multiple sources overlapping in time and frequency. However, these 
algorithms may provide a useful first step in a hierarchical search. 

Markov Chain Monte Carlo (MCMC) methods may provide a way to perform 
matched filtering more efficiently, without requiring a large template bank of possible 
signals. We follow the convention in the literature and call our search a Metropolis- 
Hastings Monte Carlo (MHMC) search rather than an MCMC search since it is not 



actually Markovian (see Section |2_2| . Both MCMC and MHMC methods have been 
explored by various groups in the context of LISA data analysis and have been shown 
to be very effective when searching for toy models [7J[S], for white-dwarf binaries [51110) 
and for single or multiple supermassive black hole binaries ([H]-[IS]). The use of an 
MCMC technique for EMRI searches was explored by Stroeer et al. [HI], based on 
a highly simplified model of the EMRI waveform. In the context of the Mock LISA 
Data Challenges (MLDCs) [T7] . monte carlo methods have been used by ourselves 
and one other group [TS] to search for the EMRI sources in the Round 2 [Tj5] and 
Round IB [5D] data sets. In this paper, we describe the algorithm that we have 
developed for the EMRI searches, and the performance of the algorithm on the Round 
IB challenge data sets. Our search code was adapted from the search code developed 
by Cornish and Porter [Til H31 HH HI] for supermassive black hole binary searches, 
but we have incorporated a significant number of refinements that are specific to the 
EMRI problem. 

The paper is organised as follows. In Section[2]we describe the search algorithm, 
including a description of the waveform model, the Metropolis-Hastings search engine 
and various refinements we have tried for the EMRI problem. In Section [3] we present 
the results that we had at the time of the MLDC Round IB deadline (December 
2007) and compare these to the true source parameters. Finally, in Section [2] we 
discuss planned future refinements of the search algorithm. 
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2. Search algorithm 

2.1. Waveform model 

The EMRI sources in the MLDC releases to date were constructed using the analytic 
kludge (AK) model of Barack and Cutler [2T]. In this model, the gravitational 
waveforms describe emission from a Keplerian orbit, but with perihelion and orbital- 
plane precessions imposed by precessing the observer about the source with rates 
taken from post-Newtonian expressions. The orbital parameters are also evolved over 
time using post-Newtonian prescriptions to account for radiation reaction. The EMRI 
waveforms have emission at multiple frequencies, corresponding to harmonics of the 
fundamental frequencies of the orbit — harmonics of the orbital frequency, v, arise 
from the eccentricity of the orbit, e; harmonics of the perihelion precession rate, A f/27r, 
arise from this precession; and harmonics of the orbital-plane precession rate, a/2ir, are 
present due to the inclination, A, of the orbital plane relative to the equatorial plane of 
the black hole. The frequency of a given waveform harmonic is given by three integers, 
(n, I, k), as / = nv + l^/ln + ka/2n. The AK waveforms are purely quadrupole in 
nature, and so |Z|, |fc| < 2, but n is unrestricted. However, the eccentricity at plunge of 
the MLDC sources is limited to [0.15, 0.25], and so harmonics with n > 6 are generally 
weak. 

In our search code, to speed up waveform evaluation, we use a truncated version of 
the AK model. We include only the n < 5 and I — 2 harmonics in the waveform model 
(harmonics with I ^ 2 are significantly suppressed relative to the / = 2 harmonics). 
In addition, we expand the Bessel functions that appear in the model [21J in powers 
of eccentricity, and keep only the three leading terms in the expansion of J n (ne) for 
each n. We include the full LISA TDI response function to account for detector 
motions, and use parameters evaluated at plunge to characterize the waveforms (this 
is in contrast to the MLDC convention, which is to specify parameters at the start 
of the observation). The resulting waveforms are faithful approximations to the full 
AK waveforms. The overlap between an AK and truncated waveform, evaluated at 
the same waveform parameters, is at worst 90%, and is typically 95%. The overlap 
tends to be higher for sources with higher mass central black holes, for which the 
emission is mostly at lower frequencies (for M ~ 1O 7 M the overlap always exceeds 
96%). The template parameter error, i.e., the difference between the parameters of 
the best-fit truncated waveform and the parameters of the true AK waveform, is also 
relatively small. In Table [l] we list the parameter errors for one of the MLDC Round 2 
training sources. These results were obtained by starting an MCMC chain at the true 
parameter values and allowing it to evolve to evaluate the posterior. The difference 
between the mean of the recovered posterior and the true parameter values provided 
an estimate of the bias in our truncated model. We note that since the data stream 
we were searching included noise we expected and saw a noise-induced bias in the 
parameter estimation. However, we could distinguish this noise bias from the model 
errors. The parameter offset values in Table [T] are typical for MLDC type sources. The 
parameters are the same as those used for the MLDC — compact object mass (m), 
central black hole mass (M), initial orbital frequency (vq), luminosity distance (Dl), 
initial eccentricity (eo), central black hole spin (x), ecliptic latitude and longitude 
(/?, 0s), orientation of central black hole spin (Ok, 4>k), orbital inclination (A) and 
three initial orbital phases (<f>o, 70j &o)- For most parameters, the error is at most 
2a, where a is the noise-induced uncertainty in the parameter, as estimated from the 
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Parameter 


ln(m/M Q ) 


ln(M/M Q ) 


ln(^ /Hz) 


D L /Gpc 


eo 


X 


cos(A) 


Error 


0.006% 


0.0008% 


0.00004% 


7% 


0.07% 


0.006% 


0.06% 


Parameter 


cos(/3) 


4>s 


cos(e K ) 


4>K 


$ 


7o 


a 


Error 


0.2% 


0.06% 


0.06% 


0.3% 


3% 


3% 


0.5% 



Table 1. Percentage difference between the parameters of the best-fit truncated 
waveform and the parameters of the true AK waveform. The true AK waveform is 
the 1.3.2 training waveform from MLDC Round 2, for which the parameters are 
ln(m/M ) = 2.3338, ln(M/M Q ) = 15.421, ln(j/ /Hz) = -7.9399, D L /Gpc = 
0.80533, e = 0.18765, \ = 0.68497, cos(A) = -0.43960, cos(/3) = 0.96737, 
<t>S = 3.6238, cos(e K ) = -0.55434, 4> K = 4.3216, <J> = 3.3913, 70 = 6.1502 
and a = 3.2400. 



width of the posterior. The error is somewhat larger for the initial phase angles, as 
the effect of the truncation accumulates over the observation, but these parameters 
are uninteresting astrophy sic ally. Overall, the truncated model provides an estimate 
of the parameters that is sufficiently close to the true parameters to ensure that a 
follow-up refinement with full AK waveforms would be quick. 

2.2. Metropolis-Hastings Monte Carlo algorithm 

Our search engine is based in the Metropolis-Hastings algorithm, which works as 
follows: Given a data set s(t) and a set of templates h(t;x), we randomly choose a 
starting point, x, in the parameter space. We then propose a jump to another point, 
y, in the space by drawing from a certain proposal distribution, q(y\x) (see below), 
and evaluate the Metropolis-Hastings ratio 

H = n{y)p{s\y)g{x\y) ^ 

n(x)p(s\x)q(y\x)' 

Here tt(x) are the priors of the parameters, which, in our analysis, were taken to be 
uniform distributions within the ranges allowed by the MLDC. The function p(s\x) is 
the likelihood 

p(s\x) = Ce-<> s - h ^ s - h(s »/ 2 , (2) 

where C is a normalization constant. This jump is then accepted with probability 
a = min(l, H), otherwise the chain stays at x. 

If the proposal distribution was independent of the step number, the algorithm 
would be Markovian. However, the MHMC algorithm we employ is not, since we 
use a variety of purposely directed proposal distributions that allow a range of 
jumps of different size and type in the parameter space. We also implement several 
annealing schemes to encourage movement of the chain and use time sliding to search 
automatically over the plunge time (which we use as a parameter instead of the initial 
frequency, We are using the MHMC algorithm as a search code to find the 

unknown parameters of the signal. Making the search non-Markovian allows more 
rapid convergence to the source parameters, at the expense of no longer being able to 
construct the posterior from the chain state distribution. In a future implementation 
of the pipeline, we will include a final Markov Chain stage to recover the posterior once 
the source parameters have been approximately identified. This was not implemented 
for the searches described here. 
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2.3. Proposal distributions 

Our primary proposal distribution is a multi-variate Gaussian, which is constructed as 
a product of Gaussian distributions in each eigendirection of the Fisher Information 
Matrix (FIM), Ty. The distribution in each eigendirection is taken to have zero 
mean and a standard deviation of o~i — 1/y/DEi. Here D is the dimensionality 
of the search space (in this case, D — 13, as we use normalized search templates) 
and Ei is the corresponding eigenvalue of V^ u . The FIM is computed numerically, 
using the same truncated AK model that we employ as the search template, but with 
the additional simplification that the detector response is modelled using the low- 
frequency approximation (as in the original AK paper [H]), rather than via the full 
TDI response. 

While the FIM based proposal is used at most steps, we also periodically draw a 
proposed point from one of several other proposal distributions: 

• Scaled uniform jumps — the proposal distribution is taken to be uniform within 
the priors for each parameter. This forces the chain to explore other, widely 
separated, regions of the parameter space. 

• Scaled FIM jumps — this proposal is based on the FIM as for the standard 
proposal, but the proposed jumps are artificially reduced in size by a factor of 
ten. These proposals are almost always accepted, forcing the chain to move 
slightly away from secondaries and hence encouraging movement. 

• Antipodal sky position — at low frequencies, the sky positions (/?, <ps) and 
(— /3, 4>s ± 7r) are equivalent in terms of the detector response. In searches for 
black hole binaries, it was found that the chains could often become locked on 
the wrong sky position, so we include a proposal that moves the chain to the 
antipodal sky position to avoid this problem. 

• Intrinsic/ extrinsic/phase only jumps — the waveforms can be divided into 
intrinsic parameters (m, M, vq, eo, Xi tyi extrinsic parameters (Al, /3, 4>s, 
Ok, 4>k) and phase offsets (<&o, 7o, «o)- Intrinsic parameters affect the frequency 
and phase evolution of the different harmonics, while extrinsic parameters only 
affect how this is projected into a detector response, and the phase offsets only 
define the relative phase of the different harmonics at one fiducial time. Proposing 
jumps in extrinsic/phase offset parameters only was designed to improve the fit 



while keeping the harmonic frequencies fixed (see Section 2.5) 



We note that no matter which proposal we use, all the waveform parameters are 
updated, i.e., we do not use Gibbs sampling to update one parameter at a time. 
Typically, we draw from each of the alternative proposals every few tens of chain 
points, but these sampling frequencies are tunable parameters which we have not yet 
optimised for the EMRI search, and so we do not quote specific numbers here. 



2.4- Simulated, Thermostated and Time Annealing. 

As the likelihood surface for EMRIs is full of secondary maxima, it is very easy for 
the search chains to get stuck. To ensure acceptable movement in the chain, we use 
simulated annealing to heat the likelihood surface [13],I15|. This smooths and flattens 
any bumps on the likelihood surface, which ensures the chain has a greater chance of 
escaping secondaries and moving uphill towards the primary peak. In the definition 
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of the likelihood, Eq. ([2]), there is a factor of 2 in the denominator of the exponent. 
We replace this factor with a "heat" parameter 0, defined by 



where £ > is the heat-index defining the initial heat, i is the step number of the chain 
and T c is the cooling schedule (i.e., the number of chain steps over which the cooling 
takes place). As the success of the choice of initial heat is only known a posteriori, we 
are still investigating the optimal heating scheme. 

To mitigate potential problems, we also employed a thermostated heating scheme 
as used in [TS] . We define 



and take the heat to be the maximum of S and defined by Eq. ^j. The thermostated 
heating scheme pumps heat into the system once we reach SNR = p c . This should 
make the chain move uphill faster, once we have begun to match the signal. This 
scheme is run in conjunction with the simulated annealing. While we continue to 
calculate both heats, we always use the heat that is highest in the search. For this 
study we chose p c = 15. 

The final annealing scheme we employ is time annealing, which is similar to the 
frequency annealing described in Ref . [T5] . The cost of evaluating the likelihood and 
the FIM for the proposal distribution depends on the length of the waveform template 
being used. It is inefficient to use full length (two year) templates initially, when the 
parameter values are poorly constrained. We therefore start off with shorter templates 
and increase the length of the template, tdur, as the chain progresses 



where i is the number of steps in the chain, Nf is the total number of iterations in the 
time annealing chain, £ m j n is the minimum length of template and t max is the maximum 
length of the template. The aim of the time annealing is to use shorter templates while 
the chain is taking big jumps exploring the parameter space, and then to increase the 
template length and refine the parameter determination once the chain has settled 
in the vicinity of the true parameters. For this analysis, we used t m { n = 4 months 
to ensure a reasonable SNR (> 20) in the shortest templates employed, and we took 
Nf — 10,000 and i max = lyr, half the length of the MLDC data release. 

In our implementation of the search, we started by running a 5,000 iteration 
chain with t& m fixed at four months, but using simulated annealing and thermostated 
heating. After 5, 000 iterations we began a 10,000 iteration time-annealing scheme, 
using thermostated heating in conjunction but not simulated annealing. Once 
the time-annealing phase was complete, we cooled the surface down over 5,000 
iterations during a final simulated annealing phase according to the scheme detailed 
by Equation (J3j without thermostated heating. We then allowed the chain to run as 
a standard MCMC with unit heat for 80,000 iterations to obtain a final refinement of 
the parameters. 




(3) 
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2.5. Harmonic identification 

The EMRI likelihood space is very complicated and is characterized by many secondary 
maxima. It is consequently very easy for the chain to get stuck on a secondary. This 
was a major concern in our analysis of the MLDC Round 2 data, and again in the 
initial analysis of the Round IB data. A chain started at a random point would 
lock very quickly onto a secondary of the signal, but would then not move far away 
from that point. However, chains seeded at different initial points would lock onto 
different secondaries. A secondary typically has one or two harmonics that overlap 
with harmonics of the true signal for a section of the data stream, and we can use that 
information to determine the true parameters. Given the parameters of a secondary, 
one can construct the cumulative overlap (in the frequency domain) of each harmonic 
of the secondary with the data stream. If the overlap is significant, the frequency range 
in which the overlap accumulates indicates the section of that secondary harmonic 
which matches a harmonic of the true signal. This is illustrated in the top panel of 
Fig-B 

In order to exploit this information, as a preliminary stage of the search we ran 
several (~ 20) chains, seeded at different points in parameter space, stopped the chains 
after ~ 1000 steps and then analysed the harmonic content of the highest SNR point 
found in each chain. Different chains would typically find different harmonics, and so 
combining this information allows a picture of the signal structure to be constructed. 
This is illustrated in the bottom panel of Fig. [I] This information can be used in the 
search in several ways — as a veto of secondaries (any parameter space point that does 
not match these harmonics cannot be the primary maximum); to fold into parameter 
priors (reject proposed steps that move harmonics too far away from these tracks); or 
as a constraint when proposing steps for the chain. 

2.6. Constrained jumps 

It is possible to modify the proposal distribution to ensure certain constraints are 
satisfied, e.g., the frequency of a given harmonic at a given time. Given a set of 
constraints, {fi(x) = 0} for i = 1, • • • , N, we can define a set of unit vectors orthogonal 
to the constraint surfaces, el — V/»/|V/i|, and decompose a step in parameter space, 
Sx, into a piece that maintains the constraints, 8x, and a piece in the space spanned 
by the el, Sx — Sx + aiel. The template mismatch, M., at the new point for a jump 
that maintains the constraints is therefore given in terms of the FIM, , by 

2M = TijSx^xi = f ij Sx i Sx j 

Czfc e fe e r + r„ m C/fcC P9 efce^e™e™) 5x' l Sx j (6) 

in which superscripts indicate vector components, Cy = (A^ 1 )^ for Aij = e~l ■ e~j and 
repeated indices are summed as usual. The matrix can be used in place of Tij to 
generate proposed steps that maintain the constraints. The matrix has N zero 
eigenvalues with corresponding eigenvectors {el}, which are ignored when constructing 
the proposed step. 

We implemented constrained jumps in the search, using the results of the 
harmonic analysis described in Section |2.5| to determine the frequencies of several 
harmonics and their rates of change at a specified time. In practice, we found it best 
to use the frequencies of three harmonics, and the time derivative of the dominant 
harmonic as the constraints. Additional frequencies or derivatives did not provide any 
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Figure 1. Top panel Cumulative overlap of a given harmonic with the data 
stream, as a function of frequency. The curves show the overlap of the A and 
E TDI data streams respectively. The overlap accumulates when the harmonic 
frequency is 4.5mHz < / < 5.5mHz, which indicates the time range where it 
matches a harmonic of the true signal. Bottom panel True harmonics (lines) and 
identified harmonics (crosses) for the MLDC Round IB training source IB. 3. 2. 
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extra information. For the AK model, it is easy to translate such constraints into 
parameter values and hence reduce the six-dimensional intrinsic parameter space — 
M, m, S, A, eo, — to a two-dimensional space spanned by eg and A. We actually 
used this dimensional reduction in our implementation of the search, although the 
general expression (JsJ) will be necessary for a generic waveform model. 

The constrained search was able to rapidly improve the SNR from the initial guess 
and if the constraints were specified exactly (using the MLDC training data), the chain 
moved steadily to the correct point. However, the harmonic analysis was not able in 
general to determine precise values for the frequencies. An error of ~ 10~ 7 Hz in one 
frequency leads to a dephasing after ~ 4 months, which limits the usefulness of the fully 
constrained search. We found it more effective to use a partially constrained search, 
i.e., using the harmonic analysis to estimate frequencies and frequency derivatives at a 
certain time with estimated uncertainties. It is possible to rcparamctcrisc the waveform 
in terms of the values of the three fundamental frequencies at a certain time, plus the 
time derivative of the dominant harmonic at the same time. An MHMC chain can 
then be run on this alternative parameter space, using tight priors on those frequencies 
assigned using the frequency uncertainty estimates. This can also be done directly in 
the physical parameter space by using the constrained FIM, and additionally 
allowing small jumps in the directions el orthogonal to the constraint surface. The 
results of one partially constrained search are illustrated in Figure [2j This type of 
search was found to be much more effective, although our initial implementation of 
the proposal was inefficient which led to a low acceptance rate. 

3. Round IB Results 

At the time of the MLDC submission deadline (December 2007), we were still 
developing the details of the search outlined above, and therefore just submitted the 
maximum a posteriori (MAP) parameters that had been found in the search by that 
time. We used the MAP values as we were sure at this stage that we were either still 
moving towards the true solution, or stuck on a secondary. It therefore made no sense 
to use the mean of a posterior to determine the best fit paramteres. Our strategy 
involved starting 10 chains at different random start points for each challenge. Due to 
time constraints we needed to stop the MCMC chains after between 35,000 and 40,000 
iterations out of 80,000. We used the MAP values from each of these chains as starting 
points for the harmonic identification and constrained stages of the search. The initial 
MHMC runs took 2-3 weeks each, the harmonic identification was then quick (a few 
hours), but the final constrained stage also took several weeks. We expect these run 
times to decrease as the algorithm is refined, but a final run time of ~ 1 week is a 
reasonable guess. We should mention that the search of Challenge IB. 3. 2 was the 
most advanced at the time of submission, and this was the only source for which we 
had begun the constrained jump phase of the search by that time. 

A harmonic analysis indicated that we were in the vicinity of secondary maxima 
for sources IB. 3.1, IB. 3. 3, IB. 3. 4 and IB. 3. 5. The harmonics of our best guess 
parameters for IB. 3. 2 appeared to be approximately in the right place, although 
the total SNR of this point did not lie within the MLDC prior, indicating that the 
parameters were not completely correct. The submitted results are summarised in 
Tables [2f|3] along with the true values of the Challenge parameters. 

As expected, our results for IB. 3. 2 agreed quite well with the true parameters, 
but in the other cases we were quite far off. The recovered parameters correspond to 
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Figure 2. SNR as a function of chain index (top-left); frequency of dominant 
harmonic vs. SNR (top-right); rate of change of frequency of dominant harmonic 
vs. SNR (bottom-left); and sky longitude vs. SNR (bottom-right), in a partially 
constrained search for the MLDC Round IB. 3. 2 challenge source. The chain is 
moving steadily towards the true parameter values, indicated by horizontal lines. 
This run did not employ either simulated annealing or time-annealing. 



secondaries of the likelihood in all cases, which share harmonics with the true signal. 
For all sources except IB. 3.1, we appear to have locked onto the antipodal sky location, 
despite the inclusion of a sky position flip as one of the proposal distributions. We 
hope to avoid this in the future by implementing delayed rejection [18 j which will be 
discussed in Section [4] 

4. Summary and future plans 

Our final search algorithm for identifying isolated EMRIs in instrumental noise was 
as follows: 1) run a set of chains without annealing for a small number of chain steps; 
2) identify harmonics of the true signal from the highest SNR points found in the 
preliminary chains; 3) run a partially constrained search, including temperature and 
time annealing, to refine the source parameters. The search runs were incomplete at 
the time of the MLDC deadline, but subsequent runs have moved closer to what we 
now know are the true parameters. The convergence rate is slow, however, so we are 
currently exploring several improvements to the algorithm, including 
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Source 


S/M 2 


m/M 


Af/(10 6 M Q ) 


z^o(mHz) 


eo 


A 


1B.3.1 


True 


0.69816 


10.296 


9.5180 


0.19204 


0.21438 


0.43946 




Recovered 


0.63624 


10.500 


10.359 


0.18711 


0.15810 


0.50800 


1B.3.2 


True 


0.63796 


9.7711 


5.2156 


0.34228 


0.20791 


1.4358 




Recovered 


0.63971 


9.7751 


5.2076 


0.34223 


0.20941 


1.4399 


1B.3.3 


True 


0.53326 


9.6973 


5.2197 


0.34257 


0.19927 


0.92822 




Recovered 


0.59655 


10.193 


5.2344 


0.34236 


0.19647 


0.75882 


1B.3.4 


True 


0.62514 


10.105 


0.95580 


0.85144 


0.45058 


1.6707 




Recovered 


0.63104 


10.085 


1.0439 


0.79510 


0.44077 


2.1837 


1B.3.5 


True 


0.65830 


9.7905 


1.0334 


0.83218 


0.42691 


2.3196 




Recovered 


0.67701 


9.8849 


0.97872 


0.83390 


0.42950 


2.5092 



Table 2. Comparison of best recovered parameters at time of MLDC submission 
to true parameters. 



Source 


P 


4>s 


0k 


4>K 


1B.3.1 


True 


0.55258 


4.9104 


1.7625 


2.0472 




Recovered 


0.64558 


4.9473 


2.2005 


1.7141 


1B.3.2 


True 


0.35970 


4.6826 


3.0372 


4.0382 




Recovered 


-0.87434 


1.1145 


1.5836 


6.0901 


1B.3.3 


True 


0.98166 


0.70967 


1.5364 


4.1487 




Recovered 


-0.69482 


3.8708 


2.2751 


3.8650 


1B.3.4 


True 


-0.98020 


0.97873 


1.7601 


4.1164 




Recovered 


0.81338 


3.2806 


0.60980 


6.0855 


1B.3.5 


True 


-1.1092 


1.0876 


0.84039 


5.7564 




Recovered 


0.39053 


1.9690 


1.9561 


5.1293 



Table 3. Comparison of best recovered parameters at time of MLDC submission 
to true parameters — sky position and source orientation. 

• Fast waveform model. By using the low-frequency approximation to the detector 
response, and interpolation of the barycentre waveform, it is possible to speed up 
the waveform and FIM evaluations which are the bottlenecks in the current code. 

• Semi-coherent analysis. As considered for template based searches in pQ, by 
dividing the data stream into sections of a few months in length, and searching 
these separately in parallel, the search speed is increased. The main difficulty is 
forcing parameter consistency between the different segments. 

• Delayed rejection. This is a technique whereby the chain is forced to accept a 
(large) jump in the parameter space, but the likelihood at the original point, xl, is 
recorded. The chain is then allowed to run (with small jumps) for a pre-specified 
number of steps, e.g., 100, to reach a final point, x), before the Metropolis- 
Hastings ratio is evaluated using Xi and x). If this step is rejected, the chain 
goes back to Xi and another jump is proposed. This technique can help the chain 
jump to widely separated secondaries in the parameter space, and it might also 
encourage the chain to switch to the correct antipodal sky position, as mentioned 
earlier. 

• Numerical F-statistic It is relatively straightforward to construct a proposal that 
changes the intrinsic parameters of the source in order to achieve a rotation of 
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the signal harmonics, i.e., to keep the dominant harmonic in the same place, but 
change the indices (n, I, k) of that harmonic. However, it will usually be necessary 
to change the extrinsic parameters and phase angles in order to find a high SNR 
at the new point. This can be accomplished by proposing a jump in the intrinsic 
parameters, and then finding the best-fit extrinsic parameters at the new point, 
either by using a template grid or by using a mini (~ 100 iteration) monte carlo 
chain that explores only the extrinsic parameters. The likelihood maximized over 
extrinsic parameters can then be used to evaluate the Metropolis-Hastings ratio 
for the proposed jump in intrinsic parameters. 

A short-coming of the algorithm described here will become apparent when the 
data stream becomes more complex and contains multiple sources, as in MLDC Round 
3. When multiple sources are present, distinguishing between a weak harmonic that 
is a side-band of an identified bright source and one that is the dominant harmonic 
of a second, weaker signal, is difficult. The location in time and frequency, and the 
shape of the track, will be useful diagnostics, but some confusion will be inevitable. 
As an alternative to identifying harmonics to use as constraints, an understanding 
of how harmonic rotations/shifts can be achieved by parameter changes should allow 
the formulation of proposal distributions that will move the chain from one secondary 
to another (similar to the "island hopping" used in supermassive black hole binary 
searches |15j). The aim of such a proposal would be to make the chain jump between 
points in parameter space that had harmonics in common with the true signal and 
hence the chain should find the true parameters more quickly. Such proposals will be 
explored when we apply these algorithms to the Round 3 data. 

Acknowledgments 

JG acknowledges support from the Royal Society and thanks the Albert Einstein 
Institute for hospitality and support while this work was being completed. The work 
of SB and EKP was supported by DLR (Deutsches Zentrum fur Luft- und Raumfahrt). 
LB acknowledges support from PPARC/STFC through grant number PP/D00111011. 

References 

[1] Gair J R, Barack L, Creighton T, Cutler C, Larson S L, Phinney E S & Vallisneri M, Class. 

Quant. Grav. 21 S1595 (2004). 
[2] Wen L & Gair J R, Class. Quantum Grav. 22 S445 (2005). 
[3] Gair J R & Wen L, Class. Quantum Grav. 22 S1359 (2005). 
[4] Wen L, Chen Y & Gair J R, AIP Conf. Proc. 873 595 (2006). 
[5] Gair J R & Jones G, Class. Quantum Grav. 24 1145 (2007). 

[6] Gair J R, Mandel I & Wen L, Journal of Physics Conf. Proc., accepted. Preprint arXiv:0710.5250 
(2007). 

[7] Andrieu C and Doucet A, IEEE Trans. Signal Process. 47 2667 (1999). 

[8] Umstatter R, Christensen N, Hendry M, Meyer R, Simha V, Veitch J, Vigeland S & Woan G, 
Phys. Rev. D 72 022001 (2005). 

[9] Cornish N J & Crowder J, Phys. Rev. D 72 043005 (2005). 
[10] Crowder J & Cornish N J, Phys. Rev. D 75 043008 (2007). 
[11] Cornish N J & Porter E K, Class. Quant. Grav. 23 S761 (2006). 
[12] Wickham E D L, Stroeer A & Vecchio A, Class. Quant. Grav. 23 S819 (2006). 
[13] Cornish N J & Porter E K, Phys. Rev. D 75 021301 (2007). 
[14] Cornish N J & Porter E K, Class. Quant. Grav. 24 S501 (2007). 
[15] Cornish N J & Porter E K, Class. Quant. Grav. 24 5729 (2007). 
[16] Stroeer A, Gair J R & Vecchio A, AIP Conf. Proc, 873 444 (2006). 



A Constrained MHMC search for EMRIs in the MLDC Round IB 



13 



[17] Arnaud K A et al., AI P Conf. Proc, 8 73 619 (2006). 
[18] Cornish N J, Preprint |arXiv:0804.3323| (2008) 

[19] Babak S, Baker J G, Benacquista M J, Cornish N J, Crowder J, Cutler C, Larson S L, 
Littenberg T B, Porter E K, Vallisneri M & Vecchio A, Proceedings of the 7th Amaldi 
Conference on GWs, July 2007, Sydney, Australia. |arXiv:0711.2667l (2007) 

[20] The MLDC taskforce, Proceedings of the 12th GWDAW, December 2007, Boston, USA. 

[21] Barack L & Cutler C, Phys. Rev. D 69 082005 (2004). 



